91 research outputs found
Sharing deep generative representation for perceived image reconstruction from human brain activity
Decoding human brain activity via functional magnetic resonance imaging
(fMRI) has gained increasing attention in recent years. While encouraging
results have been reported in brain-state classification tasks, reconstructing
the details of human visual experience remains difficult. Two main
challenges hinder the development of effective models: the perplexing
fMRI measurement noise and the high dimensionality of the limited data
instances. Existing methods generally suffer from one or both of these issues
and yield unsatisfactory results. In this paper, we tackle this problem by
casting the reconstruction of a visual stimulus as Bayesian inference of the
missing view in a multi-view latent variable model. Sharing a common latent
representation, our joint generative model of external stimulus and brain
response is not only "deep" in extracting nonlinear features from visual
images, but also powerful in capturing correlations among voxel activities in
fMRI recordings. The nonlinearity and deep structure endow our model with
strong representation ability, while the correlations among voxel activities
are critical for suppressing noise and improving prediction. We devise an
efficient variational Bayesian method to infer the latent variables and the
model parameters. To further improve reconstruction accuracy, the latent
representations of test instances are encouraged to be close to those of their
neighbours from the training set via posterior regularization. Experiments on
three fMRI recording datasets demonstrate that our approach reconstructs
visual stimuli more accurately.
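The core idea of "reconstruction as missing-view inference" can be sketched in a minimal linear-Gaussian form: both views share one latent variable z, and given only the fMRI view we infer the posterior mean of z and decode the missing image view. The matrices `W_img`, `W_fmri`, the dimensions, and the noise level below are illustrative assumptions, not the paper's actual (deep, nonlinear) model.

```python
# Minimal linear-Gaussian sketch: reconstruct a missing image view from an
# observed fMRI view via the shared latent z. All parameters are toy values.
import numpy as np

rng = np.random.default_rng(0)
d_z, d_img, d_fmri, noise_var = 4, 8, 6, 0.1

W_img = rng.standard_normal((d_img, d_z))    # decoder for the image view
W_fmri = rng.standard_normal((d_fmri, d_z))  # decoder for the fMRI view

z_true = rng.standard_normal(d_z)
y = W_fmri @ z_true + np.sqrt(noise_var) * rng.standard_normal(d_fmri)

# Posterior mean of z given y, with prior z ~ N(0, I) and
# likelihood y ~ N(W_fmri z, noise_var * I):
#   E[z | y] = (W^T W / s^2 + I)^{-1} W^T y / s^2
A = W_fmri.T @ W_fmri / noise_var + np.eye(d_z)
z_post = np.linalg.solve(A, W_fmri.T @ y / noise_var)

x_rec = W_img @ z_post  # Bayesian "reconstruction" of the missing image view
print(x_rec.shape)      # (8,)
```

In the paper's model the decoders are deep networks and a structured covariance captures voxel correlations, but the inference pattern is the same: condition on the observed brain view, then decode the missing stimulus view.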
Auditory Attention Decoding with Task-Related Multi-View Contrastive Learning
The human brain can easily focus on one speaker and suppress others in
scenarios such as a cocktail party. Recently, researchers have found that
auditory attention can be decoded from electroencephalogram (EEG) data.
However, most existing deep learning methods struggle to exploit prior
knowledge about the different views (namely, that attended speech and EEG are
task-related views) and therefore extract unsatisfactory representations.
Inspired by Broadbent's filter model, we decode auditory attention in a
multi-view paradigm and extract the most relevant and important information by
utilizing the missing view. Specifically, we propose an auditory attention
decoding (AAD) method based on a multi-view VAE with task-related multi-view
contrastive (TMC) learning. Employing TMC learning in a multi-view VAE
utilizes the missing view to accumulate prior knowledge of the different views
into the fused representation and to extract an approximately task-related
representation. We evaluate our method on two popular AAD datasets and
demonstrate its superiority by comparing it to state-of-the-art methods.
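The "task-related views" intuition can be illustrated with a standard symmetric InfoNCE objective: embeddings of EEG and attended speech from the same trial are pulled together, while mismatched trials are pushed apart. This is a generic contrastive sketch, not the paper's exact TMC loss; the random "embeddings" stand in for the outputs of real encoders.

```python
# Generic symmetric InfoNCE between two task-related views (EEG vs. attended
# speech). Matched rows are positives; all other rows in the batch are
# negatives. Embeddings here are synthetic stand-ins for encoder outputs.
import numpy as np

def _xent_diag(logits):
    """Cross-entropy where the correct class for row i is column i."""
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_p = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_p))

def info_nce(a, b, temperature=0.1):
    """Symmetric InfoNCE over L2-normalized rows of a and b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    logits = a @ b.T / temperature  # pairwise cosine similarities
    return 0.5 * (_xent_diag(logits) + _xent_diag(logits.T))

rng = np.random.default_rng(1)
speech = rng.standard_normal((16, 32))
eeg_aligned = speech + 0.05 * rng.standard_normal((16, 32))  # matched views
eeg_random = rng.standard_normal((16, 32))                   # unrelated views

# Aligned view pairs should produce a much lower contrastive loss:
print(info_nce(eeg_aligned, speech) < info_nce(eeg_random, speech))  # True
```

A low loss here corresponds to the representation having absorbed the prior knowledge that the two views are generated by the same attended source.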
Multi-view Multi-label Fine-grained Emotion Decoding from Human Brain Activity
Decoding emotional states from human brain activity plays an important role
in brain-computer interfaces. Existing emotion decoding methods still have two
main limitations: first, they decode only a single, coarse-grained emotion
category from a brain activity pattern, which is inconsistent with the
complexity of human emotional expression; second, they ignore the discrepancy
in emotion expression between the left and right hemispheres of the human
brain. In this paper, we propose a novel multi-view multi-label hybrid model
for fine-grained emotion decoding (up to 80 emotion categories) that can learn
expressive neural representations and predict multiple emotional states
simultaneously. Specifically, the generative component of our hybrid model is
parametrized by a multi-view variational auto-encoder, in which we regard the
brain activity of the left and right hemispheres and their difference as three
distinct views and use a product-of-experts mechanism in its inference
network. The discriminative component of our hybrid model is implemented by a
multi-label classification network with an asymmetric focal loss. For more
accurate emotion decoding, we first adopt a label-aware module for
emotion-specific neural representation learning and then model the dependency
among emotional states with a masked self-attention mechanism. Extensive
experiments on two visually evoked emotional datasets show the superiority of
our method. Comment: Accepted by IEEE Transactions on Neural Networks and
Learning Systems.
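The product-of-experts fusion mentioned above has a simple closed form when each view's posterior is Gaussian: precisions add, and the fused mean is precision-weighted. The sketch below shows only this fusion step; the per-view means and variances are made-up illustrative numbers rather than encoder outputs.

```python
# Product-of-experts fusion of per-view Gaussian posteriors, as used in
# multi-view VAE inference networks. For Gaussians the product is closed-form.
import numpy as np

def product_of_experts(mus, variances):
    """Fuse Gaussian experts N(mu_i, var_i) into one Gaussian (mu, var)."""
    mus = np.asarray(mus, dtype=float)
    variances = np.asarray(variances, dtype=float)
    precision = (1.0 / variances).sum(axis=0)           # precisions add
    mu = (mus / variances).sum(axis=0) / precision      # precision-weighted mean
    return mu, 1.0 / precision

# Three "views" (left hemisphere, right hemisphere, difference), each giving
# a 2-d latent posterior; the numbers are purely illustrative.
mus = [[0.0, 1.0], [2.0, 1.0], [1.0, 1.0]]
variances = [[1.0, 1.0], [1.0, 4.0], [2.0, 4.0]]
mu, var = product_of_experts(mus, variances)
print(mu, var)  # fused mean [1., 1.]; fused variances [0.4, 2/3]
```

Because a missing view simply contributes no expert, this fusion also degrades gracefully when only a subset of views is observed.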
Semi-supervised Deep Generative Modelling of Incomplete Multi-Modality Emotional Data
There are three challenges in emotion recognition. First, it is difficult
to recognize human emotional states from a single modality alone. Second,
manually annotating emotional data is expensive. Third, emotional data often
suffer from missing modalities due to unforeseeable sensor malfunctions or
configuration issues. In this paper, we address all of these problems under a
novel multi-view deep generative framework. Specifically, we propose to model
the statistical relationships of multi-modality emotional data using multiple
modality-specific generative networks with a shared latent space. By imposing
a Gaussian-mixture assumption on the posterior approximation of the shared
latent variables, our framework can learn a joint deep representation from
multiple modalities and evaluate the importance of each modality
simultaneously. To solve the labeled-data-scarcity problem, we extend our
multi-view model to the semi-supervised learning scenario by casting the
semi-supervised classification problem as a specialized missing-data
imputation task. To address the missing-modality problem, we further extend
our semi-supervised multi-view model to deal with incomplete data, where a
missing view is treated as a latent variable and integrated out during
inference. This way, the proposed framework can utilize all available data
(both labeled and unlabeled, both complete and incomplete) to improve its
generalization ability. Experiments conducted on two real multi-modal emotion
datasets demonstrate the superiority of our framework. Comment: arXiv admin
note: text overlap with arXiv:1704.07548; 2018 ACM Multimedia Conference
(MM'18).
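The "classification as missing-data imputation" idea can be made concrete: for an unlabeled instance, the label is a missing view, so the objective marginalizes the per-class losses under the classifier's distribution q(y|x) and adds a KL term against the class prior. The per-class reconstruction losses and classifier probabilities below are illustrative numbers, not model outputs.

```python
# Sketch of the unlabeled-data objective when the label y is a missing view:
#   E_{q(y|x)}[recon loss | y] + KL(q(y|x) || p(y)),  with y summed out.
import numpy as np

def unlabeled_objective(recon_loss_per_class, q_y, prior=None):
    """Marginalize the discrete label: expected loss plus KL to the prior."""
    q_y = np.asarray(q_y, dtype=float)
    recon = np.asarray(recon_loss_per_class, dtype=float)
    if prior is None:
        prior = np.full_like(q_y, 1.0 / len(q_y))  # uniform class prior
    expected_recon = float(np.dot(q_y, recon))     # sum over classes
    kl = float(np.sum(q_y * np.log(q_y / prior)))
    return expected_recon + kl

# Classifier is fairly confident in class 0, and reconstruction agrees:
loss = unlabeled_objective(recon_loss_per_class=[0.2, 1.5, 1.4],
                           q_y=[0.8, 0.1, 0.1])
print(loss)
```

Labeled instances use the ordinary single-class loss; only when the label view is missing does the sum over classes appear, which is exactly the "integrate out the missing view" treatment the abstract describes.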
MindDiffuser: Controlled Image Reconstruction from Human Brain Activity with Semantic and Structural Diffusion
Reconstructing visual stimuli from brain recordings is a meaningful and
challenging task. In particular, achieving precise and controllable image
reconstruction bears great significance for advancing brain-computer
interfaces. Despite advances in complex image reconstruction techniques, it
remains challenging to achieve a cohesive alignment of both the semantics
(concepts and objects) and the structure (position, orientation, and size)
with the image stimuli. To address this issue, we propose a two-stage image
reconstruction model called MindDiffuser. In Stage 1, the VQ-VAE latent
representations and the CLIP text embeddings decoded from fMRI are fed into
Stable Diffusion, which yields a preliminary image containing semantic
information. In Stage 2, we use the CLIP visual features decoded from fMRI as
supervisory information and continually adjust the two feature vectors decoded
in Stage 1 through backpropagation to align the structural information. Both
qualitative and quantitative analyses demonstrate that our model surpasses the
current state-of-the-art models on the Natural Scenes Dataset (NSD).
Subsequent experimental findings corroborate the neurobiological plausibility
of the model, as evidenced by the interpretability of the multimodal features
employed, which align with the corresponding brain responses. Comment: arXiv
admin note: substantial text overlap with arXiv:2303.1413
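The Stage-2 alignment loop can be caricatured with a toy gradient descent: keep adjusting a decoded latent vector so that a feature extractor's output matches the target features decoded from fMRI. In the real model the extractor is CLIP composed with Stable Diffusion's decoder; here `F`, `v`, and `target` are linear stand-ins used only to show the optimization pattern.

```python
# Toy version of Stage 2: gradient descent on the decoded vector v so that
# the (here, linear) feature map F aligns F @ v with the fMRI-decoded target.
import numpy as np

rng = np.random.default_rng(2)
F = rng.standard_normal((5, 3))   # stand-in for the CLIP visual feature map
target = rng.standard_normal(5)   # "CLIP features decoded from fMRI"
v = np.zeros(3)                   # decoded vector from Stage 1

lr = 0.05
for _ in range(500):
    residual = F @ v - target     # current feature misalignment
    v -= lr * F.T @ residual      # gradient of 0.5 * ||F @ v - target||^2

# After optimization the features are far closer to the target than at init:
print(np.linalg.norm(F @ v - target) < np.linalg.norm(target))  # True
```

The essential design choice mirrored here is that Stage 2 does not regenerate the image from scratch; it nudges the Stage-1 vectors until the structural features line up.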
Evaluation of multiple voxel-based morphometry approaches and applications in the analysis of white matter changes in temporal lobe epilepsy
The purpose of this study was to compare multiple voxel-based morphometry (VBM) approaches and to analyze whole-brain white matter (WM) changes in unilateral temporal lobe epilepsy (TLE) patients relative to controls. In our study, the performance of the VBM approaches, including standard VBM, optimized VBM, and VBM-DARTEL, was evaluated via a simulation, and these approaches were then applied to real data obtained from TLE patients and controls. The simulation results show that VBM-DARTEL performs best among these approaches. For the real data, VBM-DARTEL found WM reductions in the ipsilateral temporal lobe, the contralateral frontal and occipital lobes, the bilateral parietal lobes, the cingulate gyrus, the parahippocampal gyrus, and the brainstem of left-TLE patients, which is consistent with previous studies. Our study demonstrates that DARTEL is the most robust and reliable approach for VBM analysis.
Glutamatergic and Resting-State Functional Connectivity Correlates of Severity in Major Depression – The Role of Pregenual Anterior Cingulate Cortex and Anterior Insula
Glutamatergic mechanisms and resting-state functional connectivity alterations have recently been described as factors contributing to major depressive disorder (MDD). Furthermore, the pregenual anterior cingulate cortex (pgACC) seems to play an important role in major depressive symptoms such as anhedonia and impaired emotion processing. We investigated 22 MDD patients and 22 healthy subjects using a combined magnetic resonance spectroscopy (MRS) and resting-state functional magnetic resonance imaging (fMRI) approach. Severity of depression was rated using the 21-item Hamilton depression scale (HAMD), and patients were divided into severely and mildly depressed subgroups according to their HAMD scores. Because of their hypothesized role in depression, we investigated the functional connectivity between the pgACC and the left anterior insular cortex (AI). The sum of glutamate and glutamine (Glx) in the pgACC, but not in the left AI, predicted the resting-state functional connectivity between the two regions, and did so exclusively in depressed patients. Furthermore, functional connectivity between these regions was significantly altered in the subgroup of severely depressed patients (HAMD > 15) compared to healthy subjects and mildly depressed patients. Similarly, the Glx ratios relative to creatine in the pgACC were lowest in severely depressed patients. These findings support the involvement of glutamatergic mechanisms in severe MDD, which are related to the functional connectivity between the pgACC and AI and to depression severity.